FIELD OF THE INVENTION

The present invention relates to the automated analysis of digital images. It is more particularly concerned with the automated identification of mitotic activity in digital images of histological or cytology specimens and most particularly for the purpose of assessing the presence and severity of cancer in breast tissue, and it is in this context that the invention is principally described herein. The invention may, however, also find application in the assessment of other forms of cancer, such as colon and cervical cancer, and in the analysis of various other kinds of structure presenting image components which are amenable to identification in a similar way, for example in the analysis of soil samples containing certain types of seeds or other particles.

BACKGROUND AND SUMMARY OF THE INVENTION

Many thousands of women die needlessly each year from breast cancer, a cancer from which there is theoretically a high probability of survival if detected sufficiently early. If the presence of cancerous tissue is missed in a sample, then, by the time the next test is undertaken, the cancer may have progressed and the chance of survival significantly reduced. The importance of detecting cancerous tissue in the samples can therefore not be over-emphasised.

A typical national breast screening programme uses mammography for the early detection of impalpable lesions. Once a lesion indicative of breast cancer is detected, then tissue samples are taken and examined by a trained histopathologist to establish a diagnosis and prognosis. More particularly, one of the principal prognostic factors for breast cancer is the extent of mitotic activity, that is to say the degree of epithelial cell division that is taking place. A histopathological slide is effectively a "snapshot" representing a very short time interval in a cell division process, so the chance of a particular slide showing a particular phase of mitotic activity is very small; if such a phase is in fact present in a slide, that is a good indicator of how fast a potential tumour is growing.
In the existing manual procedure for scoring mitotic activity a histopathologist places a slide under a microscope and examines a region of it (referred to as a tile) at a magnification of x40 for indications of mitoses. Typically ten different tiles from the tissue sample are examined and a total count is made of the number of cell divisions which, in the histopathologist's opinion, are seen to be taking place in the ten tiles. This is then converted to an indication of cancer grade typically in accordance with the following table:

    Number of Mitotic Cells     Cancer Grade
    per Ten Tiles

    0 to N                      Grade 1
    (N + 1) to M                Grade 2
    > M                         Grade 3

Where Grade 1 is the least serious and Grade 3 is the most serious. Values of N and M are typically 5 and 10 but will vary in different schemes depending on the size of the tiles being observed.
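
By way of illustration only, the conversion from a total mitotic count to a grade may be sketched as follows. The function name and the treatment of the boundaries (0 to N as Grade 1, N + 1 to M as Grade 2, above M as Grade 3) are assumptions made for the sketch; the defaults N = 5 and M = 10 are the typical values mentioned above.

```python
def cancer_grade(mitotic_count, n=5, m=10):
    """Map the total mitotic count over ten tiles to a grade (1, 2 or 3).

    n and m are the scheme-dependent thresholds; 5 and 10 are only the
    typical values, and the boundary handling is an assumption.
    """
    if mitotic_count < 0:
        raise ValueError("mitotic_count must be non-negative")
    if mitotic_count <= n:
        return 1          # least serious
    if mitotic_count <= m:
        return 2
    return 3              # most serious
```

A scheme using a different tile size would simply pass different values of n and m.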

This is, however, a time consuming, labour intensive and expensive process. Qualification to perform such examination is not easy to obtain and requires frequent review. The examination itself requires the interpretation of colour images by eye, a highly subjective process characterised by considerable variations in both inter- and intra-observer analysis, i.e. variances in observation may occur for the same sample by different histopathologists, and by the same histopathologist at different times. For example, studies have shown that two different histopathologists examining the same ten samples may give different opinions on three of them, an error of 30%. This problem is exacerbated by the complexity of some samples, especially in marginal cases where there may not be a definitive conclusion. If sufficient trained staff are not available this impacts upon pressures to complete the analysis, potentially leading to erroneous assessments and delays in diagnosis.



These problems mean that there are practical limitations on the extent and effectiveness of screening for breast cancer with the consequence that some women are not being correctly identified as having the disease and, on some occasions, this failure may result in premature death. Conversely, others are being incorrectly diagnosed with breast cancer and are therefore undergoing potentially traumatic treatment unnecessarily.

It is thus an aim of the invention to provide an automated method of image analysis which can be embodied in a robust, objective and cost-effective tool to assist in the diagnosis and prognosis of breast cancer, although as previously indicated the invention may also find application in other fields.

In one aspect the invention accordingly resides in a method for the automated analysis of a digital image comprising an array of pixels, including the steps of: identifying the locations of objects within the image which have specified intensity and size characteristics; defining regions of specified extent within the image which contain respective said objects; deriving from the data within respective said regions one or more respective closed contours comprising points of equal intensities; and estimating the curvature of at least one respective said contour within respective said regions at least to produce a measure of any concavity thereof.

As will be understood from the ensuing detailed description of a preferred embodiment, such a method is of use in identifying mitotic cell nuclei in digital images of histopathological slides.

The invention also resides in apparatus for the automated analysis of a digital image comprising means to perform the foregoing method and in a computer program product comprising a computer readable medium having thereon computer program code means adapted to cause a computer to execute the foregoing method and in a computer program comprising instructions so to do.

These and other aspects of the invention will now be more particularly described, by way of example, with reference to the accompanying drawings and in the context of an automated system for grading cancer on the basis of the numbers of mitotic epithelial cell nuclei in digital images of histopathological slides of potential carcinomas of the breast.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings:

Figure 1 is a block diagram of an automated process in accordance with the invention for measuring mitotic activity for patient diagnosis;

Figure 2 is a more detailed block diagram of the main stages in the mitosis detection and measurement block of Figure 1;

Figures 3 and 4 are simplified visualisations of the contour selection stage of the process of Figure 2; and

Figure 5 illustrates a decision boundary in a Fisher classifier as used in a later stage of the process of Figure 2.

DETAILED DESCRIPTION 

Figure 1 shows a process for the assessment of tissue samples in the form of histopathological slides of potential carcinomas of the breast. The process measures mitotic activity of epithelial cells to produce a parameter for use by a pathologist as the basis for assessing patient diagnosis. It employs a database 1, which maintains digitised image data obtained from histological slides. Sections are cut from breast tissue samples (biopsies), placed on respective slides and stained using the staining agent Haematoxylin & Eosin (H&E), which is a common stain for delineating tissue and cellular structure.

To obtain the digitised image data for analysis, a histopathologist scans a slide under a microscope and at 40x magnification selects regions of the slide which appear to be most promising in terms of analysing mitotic activity. Each of these regions is then photographed using the microscope and a digital camera. In one example a Zeiss Axioskop microscope has been used with a Jenoptiks Progres 3012 digital camera. This produces for each region a respective digitised image in three colours, i.e. red, green and blue (R, G & B). Respective intensity values in the R, G and B image planes are thus obtained for each pixel in an array. In the preferred embodiment there is an eight-bit range of pixel intensities (values of 0 to 255) for each colour and the array comprises 1476 pixels across by 1160 pixels down. The image data is stored temporarily at 1 for later use. Ten digitised images (electronic equivalents of tiles) are required for the detection and measurement of mitotic activity at 2, which then provides input to a diagnostic report at 3. In principle the processing stages to be described in detail below can operate on a single waveband (R, G or B) image or a combination of them. In practice, however, the red waveband has been found to contain the most information for discriminating between mitotic and other cells when stained with H&E, and is assumed to be used in the following description. The process can be performed in a suitably programmed personal computer (PC) or other general purpose computer of suitable processing power or in dedicated hardware.

Figure 2 shows in more detail the processing stages comprised in the block 2 of Figure 1. They are carried out for each of the ten digitised images referred to above and will now be described for one such image (tile). To aid in the understanding of this description it is recalled that the aim of the process is to identify and count the number of mitotic epithelial cell nuclei (if any) in each tile. In images acquired as described above such nuclei generally appear darker than normal epithelial cell nuclei, and also have a different shape. Normal nuclei are generally convex with smooth boundaries while mitotic nuclei are more irregular in shape and have ragged boundaries. However, it is not always the case that mitotic epithelial cell nuclei are the darkest objects in a given tile; for example stromal cells, lymphocytes and necrotic cells may be darker. Given the relatively low numbers of mitoses which may be present in any given tile and yet may indicate serious disease it is important that as many as possible are correctly identified while at the same time minimising the number of any normal cell nuclei or other objects incorrectly identified as mitotic.

Location of candidate cell nuclei

Referring now to Figure 2, the first processing stage 21 consists of locating all possible candidate cell nuclei. The approach adopted for identifying the locations of potential mitotic nuclei is based on the fact that they are generally darker than average nuclei. Mitotic nuclei appear in the image as solid dark objects (i.e. dark all the way through) most of the time, or instead occasionally they form groups of small dark clumps. Hence the aim is to find concentrations of dark pixels; these are not necessarily connected groups of dark pixels, but a region containing a sufficient number of clustered dark pixels.



There are various methods of doing this. One example is simple grey-level segmentation, where a threshold is chosen and only those pixels having grey-levels below this threshold are selected. The drawback of this approach is that some mitotic nuclei are not particularly dark, but are only distinguishable from their shape characteristics. Choosing a threshold sufficiently low to detect such nuclei would yield an excess of clutter.

The preferred approach is to use the multiresolution blob filtering described below. However, as will be apparent to those skilled in the image processing art, the present invention may be practised without employing this particular technique. Alternatives include the processes described as mitotic cueing in our copending United Kingdom patent application no. 0226787.0. The general principle is, given that the approximate size of the nuclei is known, to apply a radially-symmetric filter whose output is large in magnitude when there is a region of local brightness or darkness whose shape and size approximately matches that of the filter. This filter should be a difference filter with zero mean, so areas of constant intensity are suppressed.

The method will now be described in terms of a specific implementation, namely a multi-scale blob filter as known e.g. from "Multiresolution analysis of remotely sensed imagery", J.G.Jones, R.W.Thomas, P.G.Earwicker, Int J. Remote Sensing, 1991, Vol 12, No 1, pp 107-124. The process will be described for filtering the image using a particular size of blob filter, where these are defined at successive octave (powers of 2) scale sizes.

The recursive construction process for the multi-scale blob filter involves two filters; a 3x3 Laplacian blob filter (L) and a smoothing filter (s) as defined below.

$$L = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}, \qquad s = \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$$

These two filters form a basis for filtering over a set of octave scale sizes, according to the following process:

To enhance blob-shaped objects at the original resolution (octave scale 1, pixel size 1), the image is correlated with the 3x3 Laplacian filter alone:

$$F_1(m,n) = \sum_{i=-1}^{1}\sum_{j=-1}^{1} I(m+i,\,n+j)\,L(i+2,\,j+2)$$

where $I$ is the original image, and the range of the indices m and n is set such that the indices in the summation above are always within the original image dimensions (so m and n start at 2). Values of the filtered image at locations outside these ranges are set to zero.

For computational efficiency, multiplications by ±1 need not be performed explicitly. Thus the filter output value for a pixel located at position (i,j) is given by:

$$F_1(i,j) = 8\,I(i,j) - \sum_{\substack{p,q \in \{-1,0,1\} \\ (p,q) \neq (0,0)}} I(i+p,\,j+q)$$
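
By way of illustration only, the scale 1 filtering may be sketched as below, using additions and a single multiplication by 8 per pixel. The function name is hypothetical, the image is assumed to be a list of rows of grey levels with zero-based indices, and border pixels are left at zero as described above.

```python
def laplacian_scale1(image):
    """Apply the 3x3 blob filter (centre weight 8, neighbours -1) to `image`."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Sum of the eight neighbours: the 3x3 block total minus the centre.
            block = sum(image[i + p][j + q]
                        for p in (-1, 0, 1) for q in (-1, 0, 1))
            neighbours = block - image[i][j]
            out[i][j] = 8 * image[i][j] - neighbours
    return out
```

Because the filter has zero mean, a region of constant intensity gives zero output, while a dark pixel on a brighter background gives a negative output.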

To enhance blob-shaped objects at a resolution one octave above the original (octave scale 2, pixel size 3), the image is first correlated with the 3x3 smoothing filter (s), forming a smoothed image ($S_2$). The 3x3 Laplacian blob filter (L) is then expanded by a factor of two, by padding it with zeros, to form a 5x5 filter [-1 0 -1 0 -1; 0 0 0 0 0; -1 0 8 0 -1; 0 0 0 0 0; -1 0 -1 0 -1]. This is then correlated with the smoothed image ($S_2$) to form a filtered image ($F_2$), but for computational efficiency, only the non-zero filter coefficients are used, thus:

$$F_2(m,n) = \sum_{i=-1}^{1}\sum_{j=-1}^{1} S_2(m+2i,\,n+2j)\,L(i+2,\,j+2)$$

where $S_2$ is the original image $I$ after smoothing, and the range of the indices m and n is set such that the indices in the summation above are always within the original image dimensions (so m and n start at 4). Values of the filtered image at locations outside these ranges are set to zero.

The above double correlation is equivalent to a single correlation of the original image with a 7x7 filter formed from correlating the expanded 5x5 Laplacian with the 3x3 smoothing filter, but this larger filter is never formed explicitly.

To enhance blob-shaped objects at a resolution two octaves above the original (scale 3, pixel size 7), the smoothing filter is expanded by a factor of 2 in the same manner as the Laplacian above, then correlated with the smoothed image ($S_2$) above to give a lower-resolution smoothed image ($S_3$) thus:

$$S_3(m,n) = \sum_{i=-1}^{1}\sum_{j=-1}^{1} S_2(m+2i,\,n+2j)\,s(i+2,\,j+2)$$

Following this, the 5x5 Laplacian filter is expanded by a factor of 2 by padding with zeros to form a 9x9 filter, which is correlated with the smoothed image ($S_3$) in the same computationally efficient manner, thus:

$$F_3(m,n) = \sum_{i=-1}^{1}\sum_{j=-1}^{1} S_3(m+4i,\,n+4j)\,L(i+2,\,j+2)$$

This process is repeated to obtain results at successive octave scales, namely expanding both the smoothing filter and the Laplacian blob filter each time.
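
By way of illustration only, the recursive construction may be sketched as follows. The names `correlate_dilated` and `blob_filter` are hypothetical, the filter coefficients are those implied by the 5x5 expanded filter quoted above, and zero-based indices are assumed. Only the nine non-zero taps of each expanded filter are visited, and an invalid border (`margin`) is carried from one pass to the next so that outputs depending on pixels outside the image remain zero.

```python
LAPLACIAN = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
SMOOTH = [[1 / 16, 2 / 16, 1 / 16],
          [2 / 16, 4 / 16, 2 / 16],
          [1 / 16, 2 / 16, 1 / 16]]


def correlate_dilated(image, kernel, step, margin):
    """Correlate with a 3x3 kernel whose taps are `step` pixels apart.

    `margin` is the width of the invalid border of `image`; the returned
    margin is wider by `step`. Pixels inside the new margin stay at zero.
    """
    rows, cols = len(image), len(image[0])
    new_margin = margin + step
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(new_margin, rows - new_margin):
        for n in range(new_margin, cols - new_margin):
            out[m][n] = sum(
                image[m + step * i][n + step * j] * kernel[i + 1][j + 1]
                for i in (-1, 0, 1) for j in (-1, 0, 1))
    return out, new_margin


def blob_filter(image, scale):
    """Blob-filter `image` at octave scale 1, 2, 3, ..."""
    smoothed, margin, step = image, 0, 1
    for _ in range(scale - 1):
        # Each octave smooths with a filter twice as widely spaced as the last.
        smoothed, margin = correlate_dilated(smoothed, SMOOTH, step, margin)
        step *= 2
    filtered, _ = correlate_dilated(smoothed, LAPLACIAN, step, margin)
    return filtered
```

At scale 2 the first valid output is at zero-based index 3, which corresponds to the index 4 quoted above for one-based indexing.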

The above process may be used to produce a "blob-filtered" image at any of the required octave scales. Objects of interest (i.e. clusters of dark pixels in this case) will have the greatest values of the magnitude of the filter output. The locally strongest filter output will occur at the centre of the object of interest. Individual objects (called "blobs") are now identified by finding local minima of the filter output, where each blob is assigned a position and intensity, the latter being the value of the filter output at the local minimum.
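
By way of illustration only, blobs may be extracted from a filtered image as sketched below. The function name is hypothetical, and the use of a strict comparison over the eight immediate neighbours is an assumption, since the neighbourhood used to define a local minimum is not specified here.

```python
def find_blobs(filtered):
    """Return (row, column, value) for each strict local minimum."""
    rows, cols = len(filtered), len(filtered[0])
    blobs = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = filtered[r][c]
            # Assumed rule: strictly lower than all eight neighbours.
            if all(value < filtered[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)):
                blobs.append((r, c, value))
    return blobs
```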

In this application, objects which are dark relative to the background are identified by finding local minima of the filter output at one chosen scale, in this instance octave scale 5 (a nuclear size of 31 pixels across).

For computational efficiency, the spatial resolution of the image is reduced prior to blob filtering, using the following thinning method. Each reduction in resolution by a factor of two (a "thin") is achieved by firstly correlating the image with the 3x3 smoothing filter (s), then sub-sampling by a factor of two. The formula for a single thin is:

$$T(m,n) = \sum_{i=-1}^{1}\sum_{j=-1}^{1} I(2m+i,\,2n+j)\,s(i+2,\,j+2)$$

where $I$ is the original image, $T$ is the thinned image, and the indices m and n range from 1 to the dimensions of the thinned image. Each dimension of the thinned image is given by subtracting 1 or 2 from the corresponding dimension of the original image, depending on whether this is odd or even respectively, then dividing by 2.
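
By way of illustration only, a single thin and the resulting image dimensions may be sketched as follows. The function names are hypothetical and zero-based indices are assumed, so that the one-based sample position 2m becomes 2m + 1.

```python
SMOOTH_WEIGHTS = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # divided by 16 when applied


def thinned_size(size):
    """Subtract 1 (odd size) or 2 (even size), then divide by 2."""
    return (size - 1) // 2


def thin(image):
    """Smooth with the 3x3 filter and sub-sample by a factor of two."""
    rows = thinned_size(len(image))
    cols = thinned_size(len(image[0]))
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            total = sum(
                image[2 * m + 1 + i][2 * n + 1 + j] * SMOOTH_WEIGHTS[i + 1][j + 1]
                for i in (-1, 0, 1) for j in (-1, 0, 1))
            out[m][n] = total / 16.0
    return out
```

Applying `thinned_size` twice to a dimension of 1476 gives 737 and then 368, and to a dimension of 1160 gives 579 and then 289.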



In this instance, the image is reduced in resolution by a factor of 4, by applying the above process twice, firstly on the original image, and then again on the resulting thinned image, to produce a new image whose linear dimensions are a quarter of the size of the original (area is 1/16). For example, an original image of a tile of size 1476x1160 pixels would become 368x289 pixels. Blobs are now extracted as described above from the reduced-resolution image at octave scale 3 (7x7 pixels across), this being equivalent to extracting scale 5 blobs from the original image, but being more computationally efficient.

This process identifies all objects which form dark clusters of the requisite size, which may include not only mitotic epithelial cell nuclei but also background clutter, stromal cells, lymphocytes and/or necrotic cells, plus fainter normal epithelial nuclei which are not of interest. Since it is known that the mitotic nuclei are likely to be much darker than average, only the darkest 10% of blobs are selected for further analysis. This is achieved by sorting the blobs into ascending order of filter output (so the darkest occur first), using a Quicksort algorithm (such as described in Klette R., Zamperoni P., 'Handbook of Image Processing Operators', John Wiley & Sons, 1996), finding the 10th percentile of the sorted values, and choosing all blobs darker than this percentile.
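
By way of illustration only, the selection of the darkest blobs may be sketched as below. The function name is hypothetical, blobs are assumed to be (row, column, value) tuples, and the percentile is taken simply as the value at index 10% of the way through the sorted list, which is one of several possible percentile conventions. Python's built-in `sorted` is used in place of Quicksort; the resulting order is the same.

```python
def darkest_blobs(blobs, fraction=0.10):
    """Keep the blobs whose filter output is below the given percentile."""
    ordered = sorted(blobs, key=lambda blob: blob[2])  # darkest (lowest) first
    if not ordered:
        return []
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    cutoff = ordered[index][2]
    return [blob for blob in ordered if blob[2] < cutoff]
```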

Segmentation and first clutter rejection

The next processing stage 22 aims to find an approximate segmentation of the image, to separate regions (defined as connected sets of pixels) potentially associated with the cell nuclei of interest, from the background. The full-sized original image (red component) is used at the commencement of this stage.

Firstly, a grey-level threshold is selected. This is achieved by choosing a set of 15x15 pixel neighbourhoods centred on each of the blobs selected at the end of stage 21, collating all pixels within all these neighbourhoods into a single list, and computing the mean grey-level of these pixels.

A new thresholded binary image is now produced. Pixels in the red component of the original image whose grey levels are below (darker than) the threshold mean computed above are set to 1; remaining pixels are set to 0.
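
By way of illustration only, the threshold selection and binarisation may be sketched as follows. The function names are hypothetical, the blob centres are assumed to be already expressed in full-resolution (row, column) coordinates, and the clipping of neighbourhoods at the image edge is an assumption.

```python
def neighbourhood_mean(image, centres, half_width=7):
    """Mean grey level over 15x15 neighbourhoods centred on each blob."""
    rows, cols = len(image), len(image[0])
    values = []
    for r, c in centres:
        for y in range(max(0, r - half_width), min(rows, r + half_width + 1)):
            for x in range(max(0, c - half_width), min(cols, c + half_width + 1)):
                values.append(image[y][x])
    return sum(values) / len(values)


def threshold_below(image, threshold):
    """Set pixels darker than `threshold` to 1 and all others to 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]
```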

17-DEC-2002 15:50 FROM INTELLECTUAL PROPERTY TO 001633B14444

Connected component labelling is now applied to this binary image. This is a known image processing technique (such as described in A Rosenfeld and A C Kak, 'Digital Picture Processing', Vols. 1 & 2, Academic Press, New York, 1982) which gives numerical labels to connected regions in the binary image, these being groups of connected pixels whose values are all 1. An 8-connectedness rule is used, so pixels are deemed to be connected when they are horizontally, vertically, or diagonally adjacent. Each region corresponding to a selected blob from stage 21 is assigned a separate label, enabling pixels belonging to those regions to be identified. The following region properties are then computed:

Area = number of pixels within the region

Thickness = minimum thickness of the region, defined thus: for each pixel in the region, find the minimum distance from that pixel to the outside of the region. Thickness is then defined to be the maximum of these minimum distances. Note that thickness is not the same as width; for a rectangle the thickness is half the width, and for a circle the thickness is the radius.

Regions whose area is less than 190 pixels or whose thickness is less than 4 pixels are rejected, these being too small to be mitotic cell nuclei.
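
By way of illustration only, the labelling and the two region properties may be sketched as below. The function names are hypothetical. Distances are measured between pixel centres, and everything beyond the image edge is treated as outside the region; both are assumptions, so thickness values may differ by a fraction of a pixel from other conventions.

```python
import math
from collections import deque


def label_regions(binary):
    """Return a list of regions, each a list of (row, col), 8-connected."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            pixels, queue = [], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def thickness(pixels):
    """Maximum over region pixels of the distance to the nearest outside pixel."""
    inside = set(pixels)
    # The nearest outside pixel always lies in the ring adjacent to the region.
    ring = {(y + dy, x + dx) for y, x in pixels
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)} - inside
    return max(min(math.hypot(y - by, x - bx) for by, bx in ring)
               for y, x in pixels)
```

The area of a region is simply `len(pixels)`; a region would then be rejected if its area or thickness falls below the limits stated above.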

At this stage the mean grey-level of the pixels within each region is also calculated from the red component of the original image. The overall mean and standard deviation of these mean grey-levels is then found for later use in grey-level normalisation (stage 25).

Contour selection

The next processing stage 23 incorporates two levels of contour selection to gain a better representation of the actual shape of the boundary of each remaining object at both low and high resolutions. Firstly, a low-resolution (large-scale) contour is computed, which gives an approximate shape, and secondly a high-resolution (small-scale) contour is found which gives a more accurate boundary representation. Following consistency checks between the two contours, attributes of the boundary are then measured from the small-scale contour.

For each of the objects remaining after stage 22, a local region of interest (ROI) is selected. This ROI is centred on the nominal centre of the object (as found in stage 21), and has an extent of 50 pixels in each direction, the region size being truncated as necessary to ensure the ROI lies within the bounds of the original image. This allows ROIs which would otherwise overlap the edges of the image to be included. Alternatively the ROIs could be defined by taking the regions identified in stage 22 and adding a border of a selected number of pixels. In either case, it is desirable that the ROIs exceed the size of those regions somewhat in order to ensure the generation of the low-resolution contours.
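
By way of illustration only, the truncation of an ROI at the image bounds may be sketched as follows. The function name is hypothetical, and the half-open (start, stop) convention for the returned limits is an assumption chosen to suit zero-based slicing.

```python
def clipped_roi(centre_row, centre_col, rows, cols, extent=50):
    """Return (top, bottom, left, right) limits, truncated to the image."""
    top = max(0, centre_row - extent)
    bottom = min(rows, centre_row + extent + 1)
    left = max(0, centre_col - extent)
    right = min(cols, centre_col + extent + 1)
    return top, bottom, left, right
```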

To find a low-resolution representation for the boundary of each object, the region defined by the ROI above is used to define a sub-image within the output of the blob filter (stage 21). This sub-image will consist of both positive and negative grey levels. Contours at two levels within this sub-image are then sought, namely at levels 0 and -10 which have been found to be best experimentally. By virtue of the operation of the blob filter in stage 21, the zero level contour in the respective sub-image is that contour which exhibits the highest edge strength. A contour is a curve consisting of points of equal value for some given function; in this case the function is defined by the grey-level pixel values. In this embodiment, the Matlab® contour function is employed but any contouring algorithm can be used which returns contours in the same form, as a set of contiguous points ordered around the contour (Matlab® is a well known computational tool from The MathWorks, Inc.). Matlab® returns a set of locations with sub-pixel resolution which are in order of location around the contour, i.e. traversing the set of locations is equivalent to walking around the contour. Contours are only treated as valid if they satisfy all the following four conditions:

- they form closed loops within the ROI, i.e. the last contour point is the same as the first contour point;

- they are consistent with the location of the object (there is at least one contour point whose distance from the nominal centre of the object is less than or equal to 30 pixels);

- they have a sufficiently similar area to the "nominal area" found from the grey-level segmentation computed in stage 22 (the definition of the area within a contour is given later in this section). The contour area must be at least 50% of the nominal area;

- they have the correct grey-level orientation, namely pixels within the contour are darker than those outside the contour.

The object is retained for further analysis (maintained in a list in the computer) only if a valid contour is found from at least one of the two contour levels (0 and -10). If both contour levels yield a valid contour, then the latter one (-10) is chosen for further use.




To find a high-resolution representation, the region defined by the ROI above is taken out from the red component of the original image to form a sub-image. Contours are not extracted from the image at its original resolution, because these have been found to be too rough. Instead, the resulting sub-image is expanded in size by a factor of two, using the Matlab® bilinear interpolation function, to give additional resolution. In bilinear interpolation, to find the values of a selected point not on the original image grid, its four nearest grid points are located, and the relative distances from the selected point to each of its four neighbours are computed. These distances are used to provide a weighted average for the grey-level value at the selected point.
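
By way of illustration only, bilinear interpolation at a single point may be sketched as below. This is a generic formulation, not the Matlab® routine itself; the function name is hypothetical, and the point is assumed to lie within an image of at least 2x2 pixels.

```python
import math


def bilinear(image, y, x):
    """Grey level at the fractional position (y, x), rows first."""
    rows, cols = len(image), len(image[0])
    # Top-left of the four surrounding grid points, kept inside the image.
    y0 = min(int(math.floor(y)), rows - 2)
    x0 = min(int(math.floor(x)), cols - 2)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x0 + 1]
    bottom = (1 - fx) * image[y0 + 1][x0] + fx * image[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom
```

Expanding a sub-image by a factor of two then amounts to evaluating this function at half-pixel spacing.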

This interpolated image is then smoothed before contouring, by correlating (see earlier description with reference to stage 21) with a 3x3 smoothing filter (s) defined thus:

$$s = \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$$

Valid contours are then sought at each of several threshold levels. The range of threshold levels starts at the minimum grey-level within the sub-image, and increases up to the maximum grey-level in steps of 10 grey levels, so the actual contour levels are set adaptively. Valid contours are defined in the same manner as for the low-resolution boundary above.

Having found a set of valid contours at each threshold level, the edge strength at each point on the contour is estimated. The edge strength at each image pixel is defined as the modulus of the vector gradient of the original red component of the image, i.e.

$$|\nabla I| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$

where the two partial derivatives are obtained from taking differences in pixel grey-level values in the X and Y directions respectively. The edge strength at contour points which lie between image pixels is estimated using bilinear interpolation (as described above) from the nearest pixel values. The mean edge strength around the contour is then computed. The contour having the greatest edge strength is chosen as being the most representative of the boundary of the object. If no valid contours are found, the object is rejected.
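
By way of illustration only, the edge strength at each pixel may be sketched as follows. The function name is hypothetical, and the use of central differences (with border pixels left at zero) is an assumption, since the exact differencing scheme is not specified here.

```python
import math


def edge_strength(image):
    """Modulus of the grey-level gradient at each interior pixel."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # difference in X
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # difference in Y
            out[y][x] = math.hypot(gx, gy)
    return out
```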
A consistency check between the low-resolution and high-resolution contours is then performed. The area within each contour is computed from the boundary contour using Green's Theorem (such as described in Chap 6 in 'Vector Analysis and Cartesian Tensors', D.E.Bourne and P.C.Kendall, 2nd ed, Nelson, 1977). This gives the following formula for area:

$$A = \frac{1}{2}\left|\sum_{i}\left(x_i\,y_{i+1} - x_{i+1}\,y_i\right)\right|$$

where $(x_i, y_i)$ are the contour points.
These areas are then subjected to the following tests:
High-resolution area > 0.5 * low-resolution area
High-resolution area < 1.4 * low-resolution area

If either of these tests fail, the object is rejected.
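
By way of illustration only, the area calculation and the two area tests may be sketched as below. The function names are hypothetical, and the contour is assumed to be closed, with its last point equal to its first. The ratio limits are passed as parameters; the default lower limit in particular is an assumption and should be set to whatever value the scheme in use prescribes.

```python
def contour_area(points):
    """Area enclosed by a closed contour given as a list of (x, y) points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def areas_consistent(high_area, low_area, lower=0.5, upper=1.4):
    """True if the high-resolution area lies within the permitted ratio range."""
    return lower * low_area < high_area < upper * low_area
```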

Finally, there is a check that the low- and high-resolution contours for each object overlap sufficiently. For each object, two binary images are formed. The first binary image is formed by setting the value of pixels which lie within the low-resolution contour to 1, and those pixels outside the contour to 0, using the Matlab® function 'roipoly'. The second binary image is formed in the same way from the high-resolution contour. The absolute difference of these two images is taken, resulting in another binary image in which pixels are set to 1 if and only if they lie within one of the contours and not the other, and 0 otherwise. Connected component labelling (see stage 22 description) is applied to this new binary image to identify separate regions. The thickness of each of these regions is computed (as in stage 22). If any region thickness exceeds 5 pixels, the corresponding object is rejected.

Simplified visualisations of the effects of the stage 23 processing are shown in Figures 3 and 4.

In Figure 3(a) a local region of interest 30 is defined around the nominal centre 31 of an object 32 which is represented in this Figure by a series of contours. A second object 33 also appears in the same sub-image. Figure 3(b) illustrates the low-resolution boundary contour 34 computed for the object 32 and Figure 3(c) the high-resolution boundary contour 35 computed for the same. Figure 3(d) illustrates the overlap between these two resolutions with the region of difference 36 shaded. In this case the areas within the contours 34 and 35 are sufficiently similar and the thickness of the region 36 is sufficiently small to pass all the above-mentioned consistency checks and the object 32 will be retained. In effect these checks are showing that the object is sufficiently uniformly dark to potentially represent a mitotic cell nucleus. The object 33 will not be treated as valid for the ROI 30 because its contours are not consistent with the centre 31. It will, however, be separately analysed within a separate ROI (not shown) defined around its own nominal centre.

In Figure 4(a) there is another example of a local region of interest 40 defined around an object 41. Figure 4(b) illustrates the low-resolution boundary contour 42 computed for this object and Figure 4(c) the high-resolution boundary contour 43. In this case the area of the contour 43 is substantially less than that of the contour 42 (approximately 0.4) so it fails the first of the above-mentioned consistency checks and the object 41 will not be processed further. This is indicative of a normal epithelial cell nucleus which has a relatively darker nucleolus (chosen as the high-resolution boundary because of its high edge strength) surrounded by a less dark region (chosen as the low-resolution boundary).

Boundary tracking 

The next processing stage 24 applies a tracking algorithm to the high-resolution contour representing the object's boundary for each object retained from the previous stage 23 in order to estimate curvature. The aim is to smooth the boundary and then measure curvature, because simply calculating curvature from the contour segments gives too rough a measurement. For the identification of mitotic cell nuclei the degree of non-convexity of the boundary is of interest, so the latter method of calculation is inappropriate.

The particular algorithm which has been used in the preferred embodiment is based on a Probability Density Association Filter (PDAF), such as described in Y.Bar-Shalom and T.E.Fortmann, "Tracking and Data Association", Mathematics in Science and Engineering series, vol 179, Orlando, Fl., Academic Press, 1988. This type of tracking algorithm is designed to estimate the parameters of a chosen object (target) in the presence of other measurements which are not related to the object in question (noise and clutter). In this case the target state variables are the position, orientation and curvature of the object's boundary, and the measurements are the positions of the contour points and the orientations of the lines joining each pair of contour points.

The PDAF filter requires a model for the dynamics of the boundary state. The
boundary dynamics is given by a constant curvature (the radius of curvature is set
to 10 pixels) plus an assumed random perturbation known as system noise. This
noise is determined by the variance of curvature perturbation, which is chosen
according to how irregular the boundary of a mitotic cell nucleus is expected to be.
In the preferred embodiment the curvature variance is 9 for position and 0.09 for
angle (in radians).

As a starting point, it is assumed that for each object potentially representing a
mitotic cell nucleus a connected set of edge features has been extracted from the
image. In this case, edge features are line segments joining two adjacent contour
points. Each edge feature has the following measurements that were made as part
of the contour extraction process:

• Position $x_m$, $y_m$ (horizontal and vertical image coordinates) of the centre of
the edge

• Orientation $\theta_m$, i.e. the angle between the edge and the horizontal.
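
As an illustration of the two measurements listed above, the following minimal Python sketch derives them from an ordered list of contour points. The function name `edge_features`, the wrap-around from the last point to the first, and the use of `atan2` for the orientation are assumptions made for the example; the description does not say how the contour extraction process computes these values.

```python
import math

def edge_features(contour):
    """Return one (x_m, y_m, theta_m) tuple per edge of a closed contour.

    contour: ordered list of (x, y) contour points (hypothetical input format).
    Each edge joins two adjacent points; the last point is joined to the first.
    """
    features = []
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the contour
        x_m = 0.5 * (x0 + x1)          # centre of the edge
        y_m = 0.5 * (y0 + y1)
        # Angle between the edge and the horizontal, in radians.
        theta_m = math.atan2(y1 - y0, x1 - x0)
        features.append((x_m, y_m, theta_m))
    return features

# Example: the four edges of a 2 x 2 square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
for feature in edge_features(square):
    print(feature)
```
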

The purpose of the tracker is to estimate the most likely true location, orientation
and curvature $x_s$, $y_s$, $\theta_s$, $\kappa_s$ of the boundary at each point from the above
measurements, given that there are measurement errors with an assumed
Gaussian distribution. The following information vectors are defined:

• The measurement vector $z = (x_m, y_m, \theta_m)$
• The system state vector $x = (x_s, y_s, \theta_s, \kappa_s)$.

To use the PDAF filter to do this, the following information about the true boundary
and the measurement process is required:

• The relationship between the position, orientation and curvature of
neighbouring points on the boundary (the system model). This incorporates
a transition matrix $\Phi$ linking neighbouring states x and a system noise
model that adds extra random perturbations to x.

• The relationship between the measurement vector z and the system state
x. This incorporates a transition matrix H linking x to z and a measurement
noise model that adds extra random perturbations to z.

• It is assumed that not all of the edge features are associated with the
nuclear boundary; the ones that are not are denoted clutter.

In its most general form the PDAF processes several measurements z at each step
in estimating x. In this case only one edge feature is processed at a time, so there
are only two hypotheses to be tested: either the feature is from clutter or from the
real nuclear boundary.

The system transition matrix $\Phi$ is based on constant curvature, so to predict a
neighbouring system state the unique circle or straight line with curvature $\kappa_s$ and
tangent slope $\theta_s$ going through the point $(x_s, y_s)$ is extrapolated to the point that is
closest to the next measurement point.
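
The constant-curvature prediction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's code: the function name `extrapolate` is hypothetical, and the arc length `s` is simply passed in by the caller, whereas in the method described it would be chosen so that the new point is the one closest to the next measurement point.

```python
import math

def extrapolate(x, y, theta, kappa, s):
    """Move a boundary state a signed arc length s along its own circle.

    (x, y)  : position of the boundary point
    theta   : tangent slope in radians
    kappa   : curvature (1 / radius); zero means a straight line
    Returns the predicted state (x, y, theta, kappa).
    """
    if abs(kappa) < 1e-12:
        # Straight line: position advances along the tangent, slope unchanged.
        return (x + s * math.cos(theta), y + s * math.sin(theta), theta, kappa)
    # Circle: the tangent slope turns at a constant rate kappa per unit length.
    theta_new = theta + kappa * s
    x_new = x + (math.sin(theta_new) - math.sin(theta)) / kappa
    y_new = y - (math.cos(theta_new) - math.cos(theta)) / kappa
    return (x_new, y_new, theta_new, kappa)

# Example: a quarter turn around a circle of radius 10 pixels.
print(extrapolate(0.0, 0.0, 0.0, 0.1, 10.0 * math.pi / 2.0))
```

With a radius of curvature of 10 pixels, a quarter turn starting at the origin and heading along the horizontal ends near (10, 10) with a tangent slope of π/2.
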

The system noise has a Gaussian distribution with zero mean and a covariance
matrix based on independent perturbations in curvature, orientation and lateral
offset (movement in a direction normal to the boundary). A Brownian model is
used, where the standard deviations of perturbations in curvature, orientation and
lateral offset are proportional to the square root of the arc length of the
extrapolated circle of the previous paragraph. The accumulated effect of curvature
perturbation on orientation and lateral offset is also modelled, resulting in the
following covariance matrix:





[Equation not recoverable from the source: system noise covariance matrix Q]

with respect to the system curvature $\kappa$, system slope $\theta$ and lateral offset
respectively, where s is the arc length (the circular distance between the previous
point and the estimate of the next point). The constants $\sigma_{s\kappa}$, $\sigma_{s\theta}$, $\sigma_{sy}$ define the
average roughness of the nuclear boundary, and depend on the type of cell being
analysed.

The measurement transition matrix H maps the system parameters to the
measurement parameters in the natural way:

$$H = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$



The measurement noise is based on independently Gaussian distributed
perturbations of slope and lateral offset, resulting in the following covariance matrix
with respect to measurement slope and lateral offset respectively:

[Equation not recoverable from the source: measurement noise covariance matrix]

The constants $\sigma_{m\theta}$, $\sigma_{my}$ define the average smoothness of the nuclear boundary,
and depend on the type of cell being analysed.

The following constants are used to define the clutter model:

• $\rho$ = clutter density = average number of edges per unit area that are not
associated with the nuclear boundary.

• $P_D$ = probability that the edge feature is associated with the true nuclear
boundary.

These constants depend on the clutter present in the image, both its edge strength
relative to the nuclear boundary and its average spatial density.

For a nucleus with average radius r the following parameters of the above model
are used (all units are image pixels):

• $\sigma_{m\theta}$ = 0.3 radians

• Ojq^-0.8

• $\sigma_{s\theta}$ = 0.09 radians

associated with the nuclear boundary.

• $P_D$ = 0.8

The initial value for the system covariance matrix M is given by:

[Equation not recoverable from the source: initial covariance matrix M]



The PDAF estimator is now applied sequentially as follows. The matrices $H_k$, $Q_k$
and $R_k$ are all constant in this application (as defined above). The following
expressions are those known from the Bar-Shalom and Fortmann reference quoted
above.

Update:

INNOVATION

KALMAN GAIN MATRIX

BETA WEIGHTS

$$\beta_{k,i} = \frac{e_{k,i}}{b + \sum_j e_{k,j}} \quad \text{for } i \neq 0$$

where $e_{k,i} = \exp\left(-\tfrac{1}{2}\,\nu_{k,i}^T S_k^{-1} \nu_{k,i}\right)$ for $i \neq 0$, and $b = \rho\,(1 - P_D)\,\sqrt{|2\pi S_k|}\,/\,P_D$

STATE ESTIMATE UPDATE

$$\hat{x}_k = \bar{x}_k + K_k \nu_k$$

ERROR COVARIANCE UPDATE

Prediction:

STATE ESTIMATE EXTRAPOLATION

ERROR COVARIANCE EXTRAPOLATION

$$M_{k+1} = \Phi_k P_k \Phi_k^T + Q_k$$

This process continues until the entire contour is traversed (i.e. returned to the
starting point). This is then repeated around the contour for a second time (using
the final conditions from the first pass as starting conditions for the second pass);
this ensures that the final estimate of the smoothed contour is independent of the
assumed initial conditions.
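
Because only one edge feature is processed per step, the association weights reduce to a two-hypothesis split between clutter and true boundary. The sketch below illustrates this in the textbook PDAF form, under several assumptions: the function name `beta_weights` is hypothetical, the clutter weight $\beta_0 = b / (b + e)$ is the standard form rather than one legible in this text, and the value used for the clutter density in the example is invented for illustration.

```python
import math

def beta_weights(nu, s_inv, s_det, rho, p_d):
    """Return (beta_0, beta_1) for a single candidate edge feature.

    nu    : innovation vector (list of floats)
    s_inv : inverse of the innovation covariance S (list of rows)
    s_det : determinant of S
    rho   : clutter density
    p_d   : probability that the feature belongs to the true boundary
    """
    n = len(nu)
    # Quadratic form nu^T S^-1 nu (squared Mahalanobis distance).
    d2 = sum(nu[i] * s_inv[i][j] * nu[j] for i in range(n) for j in range(n))
    e = math.exp(-0.5 * d2)
    # |2*pi*S| equals (2*pi)^n * |S| for an n x n matrix S.
    b = rho * (1.0 - p_d) * math.sqrt((2.0 * math.pi) ** n * s_det) / p_d
    beta_1 = e / (b + e)   # weight of "feature is on the boundary"
    beta_0 = b / (b + e)   # weight of "feature is clutter" (assumed form)
    return beta_0, beta_1

# Example with an identity innovation covariance and an invented clutter density.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(beta_weights([0.0, 0.0, 0.0], identity, 1.0, rho=0.01, p_d=0.8))
print(beta_weights([10.0, 0.0, 0.0], identity, 1.0, rho=0.01, p_d=0.8))
```

A feature that agrees with the prediction receives a boundary weight close to 1, while a feature far from the prediction is attributed almost entirely to clutter and therefore barely moves the state estimate.
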

The curvature of the smoothed contour derived from the PDAF tracker is now used
to find a measure of the degree of non-convexity of the object's boundary. Firstly,
the sign of the curvature is set so that it is positive where the boundary is locally
convex and negative where locally concave (as viewed from outside the boundary).
All positive values of curvature are then set to zero, leaving non-zero values only at
locations where the boundary is locally concave. A graph of curvature (Y-axis)
against perimeter arc length, i.e. distance along the boundary (X-axis), is plotted,
then the line integral of curvature with respect to arc length is computed. The
absolute value of this integral is taken to produce a non-negative result. The final
result is a dimensionless quantity giving an indication of overall non-convexity,
called the "negative curvature area". Objects which are almost completely convex,
in this case whose negative curvature area is less than 0.2, are then rejected.

The output from this process is a set of boundary measurements for each object,
namely negative curvature area, and a more precise estimate of area.
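
The negative curvature area can be illustrated with a short Python sketch. It assumes the smoothed contour is available as per-segment curvature values with matching arc-length increments, and it approximates the line integral with a simple rectangle rule; both the data layout and the function names are assumptions made for the example.

```python
def negative_curvature_area(curvatures, arc_lengths):
    """Integrate only the concave (negative) curvature along the boundary.

    curvatures  : signed curvature per segment, positive where locally convex
    arc_lengths : length of each segment along the boundary
    Returns a non-negative, dimensionless measure of non-convexity.
    """
    total = 0.0
    for kappa, ds in zip(curvatures, arc_lengths):
        concave = min(kappa, 0.0)   # positive (convex) values are set to zero
        total += concave * ds       # rectangle-rule line integral
    return abs(total)

def is_retained(curvatures, arc_lengths, threshold=0.2):
    """Objects whose negative curvature area is below the threshold are rejected."""
    return negative_curvature_area(curvatures, arc_lengths) >= threshold

# Example: a convex outline is rejected, one with a concave stretch is kept.
print(is_retained([0.1, 0.1, 0.1], [1.0, 1.0, 1.0]))
print(is_retained([0.1, -0.2, 0.1], [1.0, 2.0, 1.0]))
```
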

Grey-level normalisation

Next a normalisation process 25 is carried out to allow for differences in overall
brightness between different slides. For each remaining object, the mean grey level
of the pixels enclosed within (but not on) the high-resolution contour found in stage
23 is calculated. The statistics used for normalisation are the overall mean and
standard deviation of the grey levels of the regions obtained from stage 22. Each
object's grey level is then normalised (by subtracting this mean and dividing by this
standard deviation). The output is a statistic for each assumed nucleus.
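
A minimal Python sketch of this normalisation follows. It assumes that the reference statistics are pooled over all grey levels in the stage 22 regions and that the population standard deviation is intended; the text does not settle either point, and the function name and arguments are hypothetical.

```python
from statistics import mean, pstdev

def normalise_grey_levels(object_means, region_grey_levels):
    """Normalise each object's mean grey level against slide-wide statistics.

    object_means       : mean grey level inside each object's contour
    region_grey_levels : grey levels pooled from the reference regions
    """
    mu = mean(region_grey_levels)
    sigma = pstdev(region_grey_levels)  # population standard deviation (assumed)
    return [(g - mu) / sigma for g in object_means]

# Example: objects darker than the reference mean receive negative values.
print(normalise_grey_levels([15.0, 25.0, 35.0], [10, 20, 30, 40]))
```
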

Second clutter rejection

The next process 26 involves a second stage of classification and clutter rejection
based on the Fisher classifier to discriminate between objects representing mitotic
and non-mitotic nuclei. The Fisher classifier is a known statistical classification
method described for example in Section 4.3 of "Statistical Pattern Recognition" by
Andrew R. Webb, Arnold Press, 1999, and is preferred for this stage of the process
due to its robustness against overtraining.

In this case the Fisher classifier uses a set of information about each object that
has been derived by analysis as described above. Each information set is a feature
vector, that is an ordered list of real numbers each describing some aspect of the
object; each component number is denoted an object feature. The purpose of the
classification algorithm is to discriminate between two classes of object based on
the information contained in their feature vectors. The output of the algorithm is a
set of numbers, one for each object, indicating the likelihood that the nucleus
which it represents is a member of one of the two chosen classes (in this case the
classes are mitotic and non-mitotic).

For a given feature vector x, the standard implementation of the Fisher classifier
output is defined as:

$$F = \sum_{k=1}^{n} a_k x_k$$

where $x = (x_1, \ldots, x_n)$ is the feature vector. In this embodiment, this definition has been
extended to use non-linear functions of the feature vector, namely:

$$F = \sum_{k} a_k\, g_k(x)$$

where $a_k$ are prescribed real numbers and $g_k$ are prescribed functions of the
feature vector x. These functions and variables are chosen to give the lowest
number of misclassifications for objects with known class.

The components of the feature vector x are mean grey level (computed in stage
25) and negative curvature area (computed in stage 24). In principle area could
also be used, since smaller mitotic cell nuclei tend to be darker and less concave
than larger ones. In this case a quadratic set of basis functions ($g_k$) are used, so
that the Fisher classifier value is given by:

[Equation not recoverable from the source: quadratic polynomial in G and C with coefficients $a_k$]

where G is the normalised grey-level, C is the negative curvature area, and the
coefficients $a_k$ are derived from the training stage referred to below.

The coefficients and decision boundary for the Fisher classifier are obtained by
training the classifier on a large number of example slides provided by a
histopathologist where accurate ground truth (sets of mitotic and non-mitotic cells)
is also available. The training stage results in a classifier boundary which
minimises the total number of misclassifications, i.e. both false negatives (missed
mitotic cells) and false positives (falsely-detected non-mitotic cells). In the preferred
embodiment the resulting coefficients $a_k$ that have been derived from this training
stage are [-0.87431, 0.10205, 0.84614, .ai8744, -0.04954, -5.56334]. Figure 5
illustrates the classifier together with the data on which it was trained, where
plusses indicate mitotic cells and crosses indicate non-mitotic cells.
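
As a sketch of how a classifier value built from quadratic basis functions might be evaluated, the following Python fragment expands the two features into six terms and forms the weighted sum. The ordering of the basis functions shown here is an assumption, since the corresponding equation is not legible in this text, and the coefficients in the example are invented for illustration; they are not the trained values quoted above.

```python
def quadratic_basis(g, c):
    """Six quadratic basis functions of the two features.

    g : normalised grey level, c : negative curvature area.
    The ordering of the terms is assumed for this example.
    """
    return [1.0, g, c, g * g, g * c, c * c]

def fisher_value(coeffs, g, c):
    """Weighted sum of the basis functions, one coefficient per term."""
    basis = quadratic_basis(g, c)
    if len(coeffs) != len(basis):
        raise ValueError("expected one coefficient per basis function")
    return sum(a * f for a, f in zip(coeffs, basis))

# Example with invented coefficients (not the trained values).
example_coeffs = [-1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(fisher_value(example_coeffs, g=0.3, c=2.0))
```
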

Mitosis count

Stage 27 counts the number of objects deemed to represent the nuclei of mitotic
cells, that is to say only those objects whose values exceed a given threshold in
the output of the Fisher classifier. The preferred criterion is F>0 (illustrated as the
decision boundary in Figure 5), set to give the optimum trade-off between missed
mitotic cells and falsely-detected non-mitotic cells. The number of objects whose
classifier value exceeds this threshold defines the mitotic count for that tile. The
count for the ten tiles analysed is aggregated, and can be converted into an
indication of cancer grade in accordance with a table as previously described in
connection with the existing manual procedure.
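
The counting step can be sketched in a few lines of Python. The data layout (one list of classifier values per tile) and the function name are assumptions made for the example; the conversion of the aggregate count into a cancer grade is left out because the table it relies on is not reproduced in this part of the text.

```python
def mitotic_count(tile_values, threshold=0.0):
    """Count objects whose classifier value exceeds the threshold.

    tile_values : list of tiles, each a list of classifier values
    Returns (counts per tile, aggregate count over all tiles).
    """
    per_tile = [sum(1 for v in values if v > threshold) for values in tile_values]
    return per_tile, sum(per_tile)

# Example with two tiles; a value exactly at the threshold is not counted.
print(mitotic_count([[0.5, -0.1], [1.2, 0.0, 3.0]]))
```
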

It will be appreciated that the coding of a computer program to implement all of the
processing stages described above for the preferred embodiment of the invention
can be achieved by a skilled programmer in accordance with conventional
techniques. Such a program and code will therefore not be described further.

CLAIMS

1. A method for the automated analysis of a digital image comprising an array
of pixels, including the steps of:

(a) identifying the locations of objects within the image which have
specified intensity and size characteristics;

(b) defining regions of specified extent within the image which contain
respective said objects;

(c) deriving from the data within respective said regions one or more
respective closed contours comprising points of equal intensities; and

(d) estimating the curvature of at least one respective said contour
within respective said regions at least to produce a measure of any concavity
thereof.



2. A method according to claim 1 wherein step (a) comprises the application of
a radially-symmetric difference filter with zero mean.

3. A method according to claim 2 wherein the image is filtered at a plurality of
resolutions of increasing scale.

4. A method according to claim 2 or claim 3 wherein said locations are
identified in accordance with the locations of respective local extrema in the output
of said filter.

5. A method according to claim 4 including the step of sorting, in order of
intensity, local extrema in the output of said filter and selecting for further analysis
only those objects which correspond to a specified proportion of said extrema in
such order.

6. A method according to any preceding claim further comprising, following
step (a):

selecting an intensity threshold related to the mean intensity of pixels within
the image in neighbourhoods of said locations;

creating a binary image according to whether pixels in the first-mentioned
image are above or below said threshold;

identifying regions in the binary image composed of connected pixels which
are below said threshold in the first-mentioned image; and

rejecting from further analysis those objects which correspond to such
regions in the binary image which fall below a specified size or thickness.

7. A method according to any preceding claim wherein step (c) comprises, for
respective said regions, deriving respective first and second said contours having
respectively lower and higher resolutions, determining whether the areas and
locations of said first and second contours are consistent within specified criteria
and, if so consistent, selecting said second contour for step (d).

8. A method according to claim 7 wherein, for respective said regions, the first
said contour is derived by:

seeking within the region one or more contours of respective specified
intensities;

determining whether the or each such contour is a closed contour and
meets specified location, size and/or intensity orientation criteria; and

if more than one such contour is a closed contour and meets such criteria,
selecting from the same the contour of the lowest intensity.

9. A method according to claim 8 wherein said specified intensities are no
greater than that which corresponds to the contour of highest edge strength within
the respective region.

10. A method according to claim 9 when appended to any one of claims 2 to 5
wherein said first contour is derived by seeking one or more contours in the output
of said filter for the respective region and said specified intensities are no greater
than the zero level in such output.

11. A method according to any one of claims 8 to 10 wherein, for respective
said regions, the second said contour is derived by:

seeking within the region a plurality of contours of respective specified
intensities ranging between the lowest and highest intensities within the region;

determining whether each such contour is a closed contour and meets
specified location, size and/or intensity orientation criteria; and

if more than one such contour is a closed contour and meets such criteria,
selecting from the same the contour having the highest edge strength.

12. A method according to any preceding claim wherein step (d) includes the
application of a Probability Density Association Filter to respective said contours.



13. A method according to any preceding claim wherein step (d) comprises, for
respective said contours:

measuring the curvature of the contour at a plurality of points around the
contour, convexity and concavity being of opposite sign;

setting convex values of such curvature to zero;

plotting resultant values of curvature at said points against a measure of the
distance of the respective point along the contour; and

computing as said measure of concavity the line integral of such plot.

14. A method according to any preceding claim further comprising the step of:

(e) classifying objects into one of at least two classes in accordance with a
function of said measure of concavity of a contour corresponding to the respective
object and a measure of the mean intensity of the respective object.

15. A method according to claim 14 wherein step (e) is performed by use of a
Fisher classifier.

16. A method according to claim 14 or claim 15 wherein the intensities of
respective objects are normalised prior to step (e).

17. A method according to any one of claims 14 to 16 further comprising the
step of:

(f) counting the number of objects classified into a specified one of said
classes.

18. A method according to any preceding claim for the automated analysis of a
digital image of a histological or cytology specimen.

19. A method according to claim 18 wherein the image is of a section of breast
tissue.

20. A method according to claim 18 or claim 19 when appended to claim 17,
wherein said specified class is identified as the class of mitotic epithelial cell nuclei.

21. A method according to any one of claims 1 to 17 for the automated analysis
of a digital image of a soil sample.

22. A method for the automated identification of mitotic activity from a digital
image of a histological specimen, including the steps of:

(a) identifying the locations of objects within the image which have
specified intensity and size characteristics associated with epithelial cell nuclei;

(b) defining regions of specified extent within the image which contain
respective said objects;

(c) deriving from the data within respective said regions one or more
respective closed contours comprising points of equal intensities;

(d) estimating the curvature of at least one respective said contour
within respective said regions at least to produce a measure of any concavity
thereof; and

(e) classifying objects as representing mitotic cell nuclei as a function of
at least said measure of concavity of a contour corresponding to the respective
object.

23. A method for the automated analysis of a digital image substantially as
hereinbefore described with reference to the accompanying drawings.

24. Apparatus for the automated analysis of a digital image comprising means
adapted to perform a method according to any preceding claim.

25. A computer program product comprising a computer readable medium
having thereon computer program code means adapted to cause a computer to
execute a method according to any one of claims 1 to 23.

26. A computer program comprising instructions to cause a computer to
execute a method according to any one of claims 1 to 23.

27. Any novel and inventive feature or combination of features disclosed
herein.

A method for the automated analysis of digital images, particularly for the purpose of
assessing mitotic activity from images of histological slides for prognostication of breast
cancer. The method includes the steps of identifying the locations of objects within the
image which have intensity and size characteristics consistent with mitotic epithelial cell
nuclei, taking the darkest 10% of those objects, deriving contours indicating their
boundary shape, and smoothing and measuring the curvature around the boundaries
using a Probability Density Association Filter (PDAF). The PDAF output is used to
compute a measure of any concavity of the boundary - a good indicator of mitosis.
Objects are finally classified as representing mitotic nuclei or not, as a function of
boundary concavity and mean intensity, by use of a Fisher classifier trained on known
examples. Other uses for the method could include the analysis of images of soil
samples containing certain types of seeds or other particles.

Figure 2 refers.